Adapting machine translation models toward misrecognized speech with text-to-speech pronunciation rules and acoustic confusability

Authors

Nicholas Ruiz

Qin Gao

William Lewis

Marcello Federico

Abstract

In the spoken language translation pipeline, machine translation systems that are trained solely on written bitexts are often unable to recover from speech recognition errors due to the mismatch in training data. We propose a novel technique to simulate the errors generated by an ASR system, using the ASR system’s pronunciation dictionary and language model. Lexical entries in the pronunciation dictionary are converted into phoneme sequences using a text-to-speech (TTS) analyzer and stored in a phoneme-to-word translation model. The translation model and ASR language model are combined into a phoneme-to-word MT system that “damages” clean texts to look like ASR outputs based on acoustic confusions. Training texts are TTS-converted and damaged into synthetic ASR data for use as adaptation data for training a speech translation system. Our proposed technique yields consistent improvements in translation quality on English-French lectures.
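
The pipeline described in the abstract (words to phonemes, then phonemes back to words under a language model) can be illustrated with a toy sketch. Everything below is an assumption made for illustration and is not taken from the paper: the tiny pronunciation dictionary `PRON`, the unigram log-probabilities in `LM`, and the helper names are all hypothetical. A simple dynamic-programming search stands in for the phoneme-to-word MT system, so this sketch only reproduces homophone-style confusions, not the full range of acoustic confusability the authors model.

```python
from collections import defaultdict
from math import inf

# Hypothetical toy pronunciation dictionary (ARPAbet-like symbols); not from the paper.
PRON = {
    "write": ["R", "AY", "T"],
    "right": ["R", "AY", "T"],
    "two": ["T", "UW"],
    "too": ["T", "UW"],
    "to": ["T", "UW"],
}

# Hypothetical unigram log-probabilities standing in for the ASR language model.
LM = {"write": -2.5, "right": -1.5, "two": -3.0, "too": -3.0, "to": -1.0}


def build_phoneme_to_word(pron):
    """Invert the dictionary: phoneme tuple -> list of words pronounced that way."""
    table = defaultdict(list)
    for word, phones in pron.items():
        table[tuple(phones)].append(word)
    return table


def to_phonemes(words, pron):
    """Stand-in for the TTS analyzer: look up and concatenate each word's phonemes."""
    phones = []
    for word in words:
        phones.extend(pron[word])
    return phones


def decode(phones, table, lm):
    """Find the highest-scoring word sequence whose pronunciations spell out `phones`."""
    n = len(phones)
    best = [(-inf, None)] * (n + 1)  # best[i] = (score, words) covering phones[:i]
    best[0] = (0.0, [])
    for i in range(n):
        if best[i][1] is None:
            continue  # no word sequence ends exactly at position i
        for j in range(i + 1, n + 1):
            for word in table.get(tuple(phones[i:j]), []):
                score = best[i][0] + lm.get(word, -10.0)
                if score > best[j][0]:
                    best[j] = (score, best[i][1] + [word])
    return best[n][1]


def damage(words, pron, lm):
    """Turn clean text into ASR-like text: words -> phonemes -> most likely words."""
    return decode(to_phonemes(words, pron), build_phoneme_to_word(pron), lm)


print(damage(["write", "two"], PRON, LM))  # ['right', 'to']
```

Under these toy scores, the clean input "write two" is rewritten as "right to" because the language model prefers the more frequent homophones. Pairing such damaged source text with the original target-side translations is the kind of synthetic adaptation data the abstract refers to.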

Similar resources

Augmenting Translation Models with Simulated Acoustic Confusions for Improved Spoken Language Translation

We propose a novel technique for adapting text-based statistical machine translation to deal with input from automatic speech recognition in spoken language translation tasks. We simulate likely misrecognition errors using only a source language pronunciation dictionary and language model (i.e., without an acoustic model), and use these to augment the phrase table of a standard MT system. The a...

Cross-Word Arabic Pronunciation Variation Modeling Using Part of Speech Tagging

Speech recognition is often used as the front-end for many natural language processing (NLP) applications. Some of these applications include machine translation, information retrieval and extraction, voice dialing, call routing, speech synthesis/recognition, data entry, dictation, control, etc. Thus, much research work has been done to improve the speech recognition and the related NLP applica...

Pronunciation Modeling for Large Vocabulary Speech Recognition by Arthur

The large pronunciation variability of words in conversational speech is one of the major causes of low accuracy for automatic speech recognition (ASR). Many pronunciation modeling approaches have been developed to address this problem. Some explicitly manipulate the pronunciation dictionary as well as the set of the units used to define the pronunciations of words. Others model the pronunciati...

Joint pronunciation modelling of non-native speakers using data-driven methods

Modelling non-native speakers with different mother tongues is a difficult task for automatic speech recognition due to the large variation among speakers. One possibility for jointly modelling all speakers is to use the same speaker independent acoustic models and a joint lexicon to capture the variation. We have modified the reference lexicon using pronunciation rules that are derived in a to...

Dictionary learning: performance through consistency

We present first results from our efforts in automatically increasing and adapting phonetic dictionaries for spontaneous speech recognition. Spontaneous speech adds a variety of phenomena to a speech recognition task: false starts [1], human and nonhuman noises [2], new words [3] and alternative pronunciations. All of these phenomena have to be tackled when adapting a speech recognition system for...

Publication date: 2015
